Extraction and clustering of arguing expressions in contentious text

Authors

Amine Trabelsi

Osmar R. Zaïane

Abstract

This work proposes an unsupervised method intended to enhance the quality of opinion mining in contentious text. It presents a Joint Topic Viewpoint (JTV) probabilistic model to analyse the underlying divergent arguing expressions that may be present in a collection of contentious documents. The conceived JTV can automatically extract the associated terms denoting an arguing expression, according to the hidden topics it discusses and the embedded viewpoint it voices. Furthermore, JTV’s structure enables the unsupervised grouping of the obtained arguing expressions according to their viewpoints, using a proposed constrained clustering algorithm that is an adapted version of constrained k-means clustering (COP-KMEANS). Experiments are conducted on three types of contentious documents (polls, online debates and editorials), through six different contentious datasets. Quantitative evaluations of the topic modeling output, as well as of the constrained clustering results, show the effectiveness of the proposed method in fitting the data and generating distinctive patterns of arguing expressions. Moreover, the method empirically demonstrates a better clustering of arguing expressions than state-of-the-art and baseline methods. The qualitative analysis highlights the coherence of clustered arguing expressions of the same viewpoint and the divergence of opposing ones.


Related articles

A Joint Topic Viewpoint Model for Contention Analysis

This work suggests a fine-grained mining of different types of contentious documents, towards a summarization of contention issues. We propose a Joint Topic Viewpoint model (JTV) for the unsupervised identification and the clustering of arguing expressions according to the latent topics they discuss and the implicit viewpoints they voice. A set of experiments is conducted on three types of conte...


Perceptual Organization for Text Extraction in Natural Scenes

The automated understanding of textual information in natural scenes is an important problem to solve for the Computer Vision and Document Analysis community. In this Thesis we approach the problem of text detection and extraction from an anthropocentric point of view, arguing that the Gestalt grouping laws, as a primary process in the human vision system, is something inherent in the complex c...


A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...


Coreference resolution with deep learning in the Persian Language

Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...


Comparing k-means clusters on parallel Persian-English corpus

This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research has been done on clustering English documents. This raises the question of whether those results extend to other languages. Since the goal of document clustering is grouping of docum...



Journal:

Data Knowl. Eng.

Volume 100

Publication date: 2015
